<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
	"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">

<head>
<title>SQLObject and Database Programming in Python</title>
<meta name="version" content="S5 1.0" />
<link rel="stylesheet" href="ui/slides.css" type="text/css" media="projection" id="slideProj" />
<link rel="stylesheet" href="ui/opera.css" type="text/css" media="projection" id="operaFix" />
<link rel="stylesheet" href="ui/print.css" type="text/css"
media="print" id="slidePrint" />
<link rel="stylesheet" href="ui/custom.css" type="text/css"/>
<script src="ui/slides.js" type="text/javascript"></script>
</head>
<body>

<div class="layout">

<div id="currentSlide"></div>
<div id="header"></div>
<div id="footer">
<h1>ChiPy, 11 November 2004</h1>
<h2>SQLObject and Database Programming in Python</h2>
<div id="controls"></div>
</div>

</div>


<div class="presentation">

<div class="slide">
<h1>SQLObject and Database Programming in Python</h1>
<h3>Ian Bicking</h3>
<h4>Imaginary Landscape</h4>
</div>

<!-- ======================================== -->

<div class="slide">
<h1>The Ongoing Example</h1>

<h2>Address book:</h2>
<p>
<ul>
 <li><code>person</code>:
 <ul>
  <li><code>first_name</code></li>
  <li><code>last_name</code></li>
  <li><code>email</code> (many emails for one person)</li>
 </ul></li>
 <li><code>email</code>:
 <ul>
  <li><code>type</code></li>
  <li><code>address</code></li>
 </ul></li>
</ul>
</p>
</div>

<!-- ======================================== -->

<div class="slide">
<h1>The DB API</h1>

<p>Database programming in Python is based on the <b><a
href="http://www.python.org/peps/pep-0249.html">DB-API</a></b>.
Some supported databases:

<ul>
<li>MySQL</li>
<li>PostgreSQL (multiple drivers)</li>
<li>SQLite</li>
<li>Firebird (Interbase)</li>
<li>MAXDB (SAP DB)</li>
<li>Oracle, Sybase, MS SQL Server</li>
</p>
</div>

<!-- ======================================== -->

<div class="slide">
<h1>DB API Example</h1>

<p>
Example:

<pre>
import MySQLdb
conn = MySQLdb.connect(
    db=<span class="string">'test'</span>, username=<span class="string">'ianb'</span>)
cur = conn.cursor()
cur.execute(
    <span class="sql">"SELECT id, first_name, last_name FROM person"</span>)
result = cur.fetchall()
for id, first_name, last_name in result:
    print <span class="string">"%2i %s, %s"</span> % (id, last_name, first_name)
</pre>

</p>
</div>

<!-- ======================================== -->

<div class="slide">
<h1>DB API</h1>

<ul>
<li> Import the database module (<code>MySQLdb</code>,
<code>psycopg</code>, <code>sqlite</code>, etc) </li>
<li> Use <code>module.connect(...)</code> to create a
connection. </li>
<li> Use <code>connection.cursor()</code> to get a cursor.  Cursors do
all the work. </li>
<li> Use <code>cursor.execute(sql_query)</code> to run
something. </li>
<li> Use <code>cursor.fetchall()</code> to get results. </li> 

</div>

<!-- ======================================== -->

<div class="slide">
<h1>DB API problems</h1>

<ol>
<li> Connections and cursors are tedious. </li>
<li> The databases don't all work the same; multiple SQL
dialects. </li>
<li> Lots of time spent writing SQL. </li>
<li> Database updates can effect all the SQL you've written. </li>
<li> Unless you make abstractions... </li>

</div>

<!-- ======================================== -->

<div class="slide">
<h1>Abstractions</h1>

<p>
<ul>
<li> Every significant database application has a database abstraction
layer. </li>
<li> You write one whether you mean to or not. </li>
</p>

</div>



<!-- ======================================== -->

<div class="slide">
<h1>Object-Relational Mappers</h1>

<p>

The table:

<table border=1>

<tr><th colspan=3>person</th></tr>

<tr>
 <th>id</th>
 <th>first_name</th>
 <th>last_name</th>
</tr>

<tr>
 <td>1</td>
 <td>John</td>
 <td>Doe</td>
</tr>
<tr>
 <td>2</td>
 <td>Tom</td>
 <td>Jones</td>
</tr>
<tr>
 <td colspan=3 align=center>...</td>
</tr>
</table>

<p>

</div>

<!-- ======================================== -->

<div class="slide">

<h1>ORMs: Classes</h1>

<p>

The class:

<table width="100%">
<tr><td>
<table border=1>

<tr class="code-accent"><th colspan=3>person</th></tr>

<tr class="code-accent">
 <th>id</th>
 <th>first_name</th>
 <th>last_name</th>
</tr>

<tr>
 <td>1</td>
 <td>John</td>
 <td>Doe</td>
</tr>
<tr>
 <td>2</td>
 <td>Tom</td>
 <td>Jones</td>
</tr>
<tr>
 <td colspan=3 align=center>...</td>
</tr>

</table>
</td>

<td>
<pre>
class Person(SQLObject):
    <span class="comment"># id is implicit</span>
    first_name = StringCol()
    last_name = StringCol()
</pre>
</td></tr></table>

</div>

<!-- ======================================== -->


<div class="slide">

<h1>ORMs: Instances</h1>

<p>

An instance:

<table width="100%">
<tr><td>

<table border=1>

<tr><th colspan=3>person</th></tr>

<tr>
 <th>id</th>
 <th>first_name</th>
 <th>last_name</th>
</tr>

<tr class="code-accent">
 <td>1</td>
 <td>John</td>
 <td>Doe</td>
</tr>
<tr>
 <td>2</td>
 <td>Tom</td>
 <td>Jones</td>
</tr>
<tr>
 <td colspan=3 align=center>...</td>
</tr>

</table>

</td><td>

<pre>
<span class="prompt">&gt;&gt;&gt;</span> john = Person.get(1)
<span class="prompt">&gt;&gt;&gt;</span> john.first_name
<span class="output">'John'</span>
</pre>

</td></tr></table>
</p>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>ORM: summary</h1>

<ul>
 <li>Every table is a class</li>
 <li>Every row is an instance</li>
 <li>Columns become attributes</li>
</ul>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>Usage</h1>

<pre>
__connection__ = <span class="string">'postgres://pgsql@localhost/test'</span>
class Person(SQLObject):
    first_name = StringCol()
    last_name = StringCol()
    emails = MultipleJoin('Email')
</pre>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>Usage (classes)</h1>

<pre>
class Email(SQLObject):
    person = ForeignKey(<span class="string">'Person'</span>)
    type = EnumCol(['home', 'work'])
    address = StringCol()
</pre>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>Usage (creating tables)</h1>

<pre>
def createTables():
    for table in (Person, Email):
        table.createTable(ifNotExists=True)
</pre>

</div>

<!-- ======================================== -->

<div class="slide">
<h1>Usage (instances)</h1>

Using your classes:

<pre>
<span class="prompt">&gt;&gt;&gt;</span> john = Person(first_name=<span class="string">'John'</span>, last_name=<span class="string">'Doe'</span>)
<span class="prompt">&gt;&gt;&gt;</span> email = Email(person=john, type=<span class="string">'home'</span>, 
<span class="prompt">...</span>     address=<span class="string">'john@work.com'</span>)
<span class="prompt">&gt;&gt;&gt;</span> john.emails
<span class="output">[]</span>
</pre>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>Usage (instances)</h1>

<pre>
<span class="prompt">&gt;&gt;&gt;</span> tom = Person(first_name=<span class="string">'Tom'</span>, last_name=<span class="string">'Jones'</span>)
<span class="prompt">&gt;&gt;&gt;</span> tom is Person.get(tom.id)
<span class="output">True</span>
<span class="prompt">&gt;&gt;&gt;</span> list(Person.selectBy(first_name=<span class="string">'John'</span>))
<span class="output">[]</span>
</pre>

</div>


<!-- ======================================== -->


<div class="slide">
<h1>Defining Classes</h1>

<ul>
 <li> The class name matches the table name </li>
 <li> The attributes match the column names </li>
 <li> (kind of...) </li>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>Defining Classes...</h1>

<p>
You can add extra methods:

<pre>
class Person(SQLObject):
    ...
    def _get_name(self):
        return self.first_name + ' ' + self.last_name
</pre>

<pre>
<span class="prompt">&gt;&gt;&gt;</span> tom.name
<span class="output">'Tom Jones'</span>
</pre>

</div>

<!-- ======================================== -->

<div class="slide">
<h1>Automatic properties</h1>

<ul>
<li> <code>_get_attr()</code> methods are called whenever the
<code>obj.attr</code> attribute is called </li>
<li> <code>_set_attr()</code> methods are called whenever the
<code>obj.attr = value</code> statement is run </li>
<li> <code>_set_attr()</code> is optional </li>
</ul>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>Defining classes...</h1>

<p>
You can override columns:

<pre>
class Person(SQLObject):
    ...
    last_name_searchable = StringCol()
    def _set_last_name(self, value):
        self._SO_set_last_name(self, value)
        self.last_name_lower = re.sub(<span class="string">r'[^a-zA-Z]'</span>, <span class="string">''</span>, value).lower()
</pre>

</div>

<!-- ======================================== -->

<div class="slide">
<h1>Defining classes...</h1>

<p>
You can fiddle with the naming:

<pre>
class Person(SQLObject):
    _table = <span class="string">"people"</span>
    first_name = StringCol(dbName=<span class="string">"fname"</span>)
    ...
</pre>

</p>

</div>

<!-- ======================================== -->

<div class="slide">
<h1>Connecting classes</h1>

<p>
Foreign Keys:

<pre>
class Email(SQLObject):
    person = ForeignKey(<span class="string">'Person'</span>)
    ...
</pre>

Note we use a string for <code class="string">'Person'</code>.


</div>

<!-- ======================================== -->


<div class="slide">
<h1>Backreferences</h1>

<p>
The other side of one-to-many:

<pre>
class Person(SQLObject):
    ...
    emails = MultipleJoin(<span class="string">'Email'</span>)
</pre>

</div>

<!-- ======================================== -->

<div class="slide">
<h1>Many-to-many</h1>

<p>
Many to many relationships imply a "hidden" table:

<pre>
class Address(SQLObject):
    people = RelatedJoin(<span class="string">'Person'</span>)
    ...
class Person(SQLObject):
    addresses = RelatedJoin(<span class="string">'Address'</span>)
</pre>

</div>

<!-- ======================================== -->

<div class="slide">
<h1>Many-to-many...</h1>

<p>
The intermediate table created:

<pre>
CREATE TABLE address_person (
  address_id INT NOT NULL,
  person_id INT NOT NULL
);
</pre>
</p>

</div>

<!-- ======================================== -->

<div class="slide">
<h1>Lazy classes</h1>

<p>
SQLObject can use reflection to figure out what columns your table
has:

<pre>
class Person(SQLObject):
    _fromDatabase = True
</pre>

You can start with <code>_fromDatabase = True</code> and add 
in explicit columns, overriding defaults.

</p>

</div>

<!-- ======================================== -->



<div class="slide">
<h1>Instances</h1>

<ul>
<li>Use <code>Person.get(id)</code> to retrieve a row</li>
<li>Use <code>Person(first_name=...)</code> to insert a row (the 
new object is returned)</li>
<li>Use <code>aPerson.destroySelf()</code> to delete a row</li>
</ul>

</div>

<!-- ======================================== -->

<div class="slide">
<h1>Updating</h1>

<ul>
<li><code>aPerson.first_name = <span
class="string">"new_name"</span></code><br>
 <code class="sql">UPDATE</code> is executed immediately </li>
<li> Update multiple columns at once: <br>
<code>aPerson.set(first_name=<span
class="string">"new_fname"</span>, last_name=<span
class="string">"new_lname"</span>)</code>  </li>
</ul>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>Selecting</h1>

<p>
You can use Python expressions (kind of):

<pre>
query = Person.q.first_name == <span class="string">"Joe"</span>
Person.select(query)
</pre>

</p>


</div>

<!-- ======================================== -->


<div class="slide">
<h1>Selecting...</h1>

<p>

<ul>
<li><code>Class.<b style="color: #990000">q</b>.column_name</code> creates a magical
object that creates queries.</li>
<li><code>sqlrepr()</code> turns these query objects into SQL (mostly
hidden)</li>
<li><code>sqlrepr(Person.q.first_name == <span
class="string">"Joe"</span>, <span class="string">'mysql')</code>
creates the SQL <code class="sql">person.first_name =
'Joe'</code></li>

</p>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>Selecting...</h1>

<p>
Complicated joins are possible:

<pre>
Person.select((Person.q.id == Email.q.personID)
              & (Email.q.address.startswith('joe')))
</pre>

Becomes:

<pre>
SELECT person.id, person.first_name, person.last_name
FROM person, email
WHERE person.id = email.person_id
      AND email.address LIKE 'joe%'
</pre>

</p>

</div>

<!-- ======================================== -->

<div class="slide">
<h1>SelectResult</h1>

Select results are more than just lists:

<ul>
<li> You can slice them, and <code class="sql">LIMIT</code> and <code
class="sql">OFFSET</code> statements are added to your query</li>
<li> You can iterate through them, and avoid loading all objects into
memory </li>
<li> You can do things like <code>select_result.count()</code> to run
aggregate functions </li>
<li> To get a real list, you must do <code>list(select_result)</code></li>

</ul>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>Caching</h1>

<ul>
<li>SQLObject caches everything</li>
<li>Maybe too much (bad if you have multiple updaters)</li>
<li>Needs to because there's no joins</li>
<li>Has potential to be faster than ad hoc queries</li>
<li>Thread safety or thread confusion?</li>
</ul>

</div>

<!-- ======================================== -->

<div class="slide">
<h1>Performance</h1>

<ul>
<li> Not good if you are dealing with lots and lots of rows </li>
<li> Inserts and selects substantially slower </li>
<li> But for many applications you'll pay that price sooner or later
anyway </li>
</ul>

</div>


<!-- ======================================== -->

<div class="slide">
<h1>The problem with ORMs</h1>

<p>
ORMs live in the world between Object Oriented programmers (and
programs) and Relational programmers (and programs).  Neither will be
happy.
</p>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>ORM Mismatch</h1>

<ul>
 <li> Python classes aren't tables</li>
 <li> Python instances aren't rows</li>
<ul>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>ORM Mismatch: Classes</h1>

<ul>
 <li> Databases don't have inheritance </li>
 <li> Even when they do, no one uses it </li>
 <li> Some tables are too boring (e.g., intermediate tables for
 many-to-many relationships), or don't warrant the overhead </li>
</ul>

</div>

<!-- ======================================== -->

<div class="slide">
<h1>ORM Mismatch: Instances</h1>

<ul>
 <li> Instances get garbage collected, rows are forever </li>
 <li> Rows exist even when instances don't </li>
 <li> Instances may still exist when the row is gone </li>
 <li> In the relational calculus, rows don't have real identity </li>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>ORM Mismatch: Relations</h1>

<ul>
 <li> Things like joins don't make sense for objects, but are central
 to relations </li>
 <li> Views and stored procedures also don't make sense </li>
 <li> Aggregate functions and set operations aren't natural in Python 
 (the <code>for</code> loop is natural) </li>
</ul>

</div>

<!-- ======================================== -->

<div class="slide">
<h1>Solving the question by avoidance</h1>

<ul>
 <li> SQLObject doesn't let you forget SQL </li>
 <li> You still have to think relationally </li>
 <li> But many relational concepts don't work either </li>
 <li> Advantage: it's easy </li>
</ul>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>Solving the Mismatch</h1>

SQLObject's answer: don't try too hard.

<ul>
 <li> There exist database restrictions; not every database schema maps
 well.  (But most are fine) </li>
 <li> Sometimes you have to do things more slowly or more imperatively,
 compared to a purely relational technique </li>
 <li> There exist object restrictions; not every object-oriented
 concept maps well; e.g., no inheritance </li>
 <li> Inheritance isn't planned; this isn't transparent persistence </li>
</ul>

</div>

<!-- ======================================== -->

<div class="slide">
<h1>Some other alternatives</h1>

Several Python ORMs exist; contrasts: 
<ul>
 <li> SQLObject doesn't try to be too OO, so it works well with normal 
 database schemas </li>
 <li> SQLObject doesn't have a compile step </li>
 <li> SQLObject isn't <i>just</i> a SQL wrapper </li>
 <li> SQLObject is reasonably complete </li>
</ul>

</div>

<!-- ======================================== -->


<div class="slide">
<h1>Other alternatives...</h1>

<p>

See the Python Wiki at: <a href="http://python.org/moin">python.org/moin</a><br>
The page is <a href="http://python.org/moin/HigherLevelDatabaseProgramming">HigherLevelDatabaseProgramming</a>.
</p>

</div>

<!-- ======================================== -->



</div>

</body>
</html>
